ABSTRACT

The purpose of this paper is to simultaneously
optimize decision rules for combinations of elementary decisions. As
a result of this approach, rules are found that make more efficient
use of the data than does optimizing those decisions separately. The
framework for the approach is derived from empirical Bayesian theory.
To illustrate the approach, two elementary decisions (selection and
mastery decisions) are combined into a simple decision network. A
linear utility structure is assumed. Decision rules are derived both
for quota-free and quota-restricted selection-mastery decisions for
several subpopulations. An empirical example of instructional
decision making in an individual study system concludes the paper.
The example involves 43 freshmen medical students (27 were
disadvantaged and 16 were advantaged with respect to elementary
medical knowledge). Both the selection and mastery tests consisted of
17 free-response items on elementary medical knowledge with test
scores ranging from 0 to 100. The treatment consisted of a
computer-aided instructional program. Three data tables and three
figures are provided. (Author/TJH)
Abstract

The purpose of this paper is to simultaneously optimize
decision rules for combinations of elementary decisions. As a
result of this approach, rules are found that make more
efficient use of the data than optimizing these decisions
separately. The framework for the approach is derived from
(empirical) Bayes theory. To illustrate the approach, two
elementary decisions (viz. selection and mastery decisions)
are combined into a simple decision network. A linear utility
structure is assumed. Decision rules are derived both for
quota-free and quota-restricted selection-mastery decisions
in case of several subpopulations. An empirical example of
instructional decision making in an individual study system
concludes the paper.
Introduction

Decision problems in educational and psychological testing
can be classified in many ways. An elegant typology of
test-based decisions has been given in van der Linden (1985,
1988). Each type of decision making in this typology can be
viewed as a specific configuration of three basic elements,
namely a test, a treatment, and a criterion. In general, the
following four different types of decision problems can be
distinguished: selection, mastery, placement, and
classification.

Educational applications of the four types of decision 
making can be found in such fields as the admission of 
students to schools (selection), pass-fail decisions
(mastery), the aptitude-treatment-interaction paradigm in
instructional psychology (placement), and vocational guidance
situations where the most promising schools must be identified
(classification).

In Hambleton and Novick (1973), Huynh (1976, 1977),
Mellenbergh and van der Linden (1981), Novick and Petersen
(1976), Petersen (1976), Petersen and Novick (1976), van der
Linden (1980, 1981, 1987), and Vos (1988), these elementary
decision problems have been studied extensively; these
authors also indicate how optimal decision rules can be
found, analytically or numerically, using (empirical)
Bayesian decision theory.

The four elementary decisions can be met either in their
pure forms or in combinations with each other. The latter is
the case, for instance, in test-based decision making in
individualized study systems (ISS's), which can be conceived
of as networks consisting of these various types of decisions
as nodes (Vos & van der Linden, 1987). In such systems,
decision making can be viewed as guiding students through
a network of several of the elementary decisions.

The purpose of this paper is the simultaneous
optimization of combinations of elementary decisions using a
decision-theoretic approach. Compared with separate
optimization of elementary decisions, two main advantages can
be identified. First, rules making more efficient use of the
data can be found. Second, utility structures can be made
more realistic. In order to illustrate the approach, in this
paper a selection and a mastery decision will be combined
into a simple decision network, and it will be indicated how
optimal rules for guiding students through such a system can
be derived. The first advantage of the simultaneous approach
is illustrated using this simple system. For instance, when
optimizing acceptance-rejection rules in the combined
decision network, pass-fail decisions to be made later can
already be taken into account (see also Figure 2). The second
advantage will be explained after the utility function for
the combined decision has been specified.

For each elementary decision, one or more of the
following restrictions may apply (van der Linden, 1988):

(1) Multiple populations. The problem of culture-fair
decision making may arise because of the presence of
subpopulations reacting differently to the test items, e.g.,
for populations defined by race or sex. In such a case, the
test items are often assumed to be "biased" against some
of the populations.

(2) Quota restrictions. For some treatments, due to shortage
of resources, the number of vacancies is constrained.

(3) Multivariate test data. The decisions are based on data
from a whole test battery instead of a single test.

(4) Multivariate criteria. The success of the treatments is
measured by multiple criteria.

In the present paper, restrictions will be considered only
with respect to the presence of subpopulations and the number
of students to be accepted for some treatments. First, the
problem of culture-fair decision making will be considered
for a quota-free selection problem. Next, optimal rules will
be derived for quota-restricted selection problems using
methods of constrained optimization. The final section
presents some empirical examples of optimal cut-off scores
for quota-free as well as quota-restricted selection-mastery
decisions for two subpopulations referred to as the
disadvantaged and the advantaged populations.

Statement of the Problem 

As noted before, a well-known example of combinations of
elementary decisions in education is an individualized
instruction system. Figure 1 shows a flowchart of a system in
which a selection decision is followed by a treatment, here
an instructional module. Then a mastery decision follows,
after which a placement decision assigns the students to two
different routes through a module, both leading to the same
learning objective. Real-life ISS's often have more decision
points.



Insert Figure 1 about here 



Selection-mastery decisions may occur in an ISS, for
instance, when decisions on the admission of students to the
system should be made. Then a selection test is administered
before the treatment takes place, and students promising
satisfactory results on the criterion are accepted for the
first module of the instructional program (see Figure 2).
Furthermore, let us suppose that the criterion is unreliably
measured, which is not uncommon in ISS's. If success on the
criterion is measured by a threshold value separating
"masters" from "nonmasters", then, in fact, after the
treatment a mastery decision has to be taken, and the problem
is a selection-mastery decision problem. Students who have
reached the module objectives may proceed with the next
module. However, students who failed are provided with
supplemental instruction, extra learning time, corrective
feedback, and the like. These students have to prepare
themselves for a new mastery test.
Insert Figure 2 about here



In the following, we shall suppose that in the
selection-mastery decision problem $g$ ($g \geq 2$) subpopulations
reacting differently to the test items can be distinguished.
Furthermore, it is assumed that the observed selection test
score variable $X$, the observed mastery test score variable $Y$,
and the true score variable $T$ underlying $Y$, i.e., the
criterion score, assume only continuous values. Formally, the
presence of populations reacting differently to test items
implies different cut-off scores for each population.
Therefore, let $x_{ci}$ and $y_{ci}$ denote the cut-off scores for
subpopulation $i$ ($i = 1, 2, \ldots, g$) on the observed test
score variables $X$ and $Y$, respectively. However, the cut-off
score $t_c$ on the criterion score $T$ is assumed to be equal for
each population and is set in advance by the decision-maker.
The combined decision problem can now be stated as choosing
values of $x_{ci}$ and $y_{ci}$ that, given the value of $t_c$, are
optimal in some sense.

In the present paper, in line with common practice in
criterion-referenced testing, we consider only decisions in
which the decision rules $\delta$ have a monotone form: students
are admitted to a treatment if their test score is at or above
a certain cutting point and rejected otherwise. For our
example, these rules can be defined in the following way:

(1)  $$\delta(X, Y) = \begin{cases} a_0 & \text{for } X < x_{ci} \\ a_1 & \text{for } X \geq x_{ci},\ Y < y_{ci} \\ a_2 & \text{for } X \geq x_{ci},\ Y \geq y_{ci}, \end{cases}$$

where $a_0$, $a_1$, and $a_2$ stand for the actions to reject a
student, to retain an accepted student, and to advance an
accepted student, respectively.
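
The monotone rule in (1) can be sketched in a few lines of code. This is only an illustration: the function name `decide` and the string labels for the actions are hypothetical and do not appear in the paper.

```python
def decide(x, y, x_c, y_c):
    """Monotone selection-mastery rule of Formula 1 (illustrative sketch).

    x, y     : observed selection and mastery test scores
    x_c, y_c : cut-off scores for the student's subpopulation
    Returns "a0" (reject), "a1" (retain) or "a2" (advance).
    """
    if x < x_c:
        # Rejected students never take the mastery test, so y is ignored.
        return "a0"
    if y < y_c:
        return "a1"
    return "a2"
```

For a rejected student the mastery score is never observed, so any placeholder value may be passed for `y`.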

An appropriate framework for dealing with decision
problems such as the above is (empirical) Bayesian decision
theory (e.g., DeGroot, 1970; Ferguson, 1967; Keeney & Raiffa,
1976; Lindgren, 1976). Besides the actions, probabilities and
utilities are two other fundamental elements in a Bayesian
procedure. In case of an ISS, a probability model predicts
the outcomes of the several possible routes for the students,
and a utility structure evaluates the outcomes predicted. The
optimal procedure as prescribed by Bayesian decision theory
is to look for a decision rule that maximizes expected
utility.

With respect to the first element, it will be assumed
that for each population $i$, the probability function
$n_i(x,y,t)$ of the joint distribution of $(X, Y, T)$ is available.
Note that, due to the presence of different populations
reacting differently to test items, different probability
functions for each population should be assumed.

Also, the decision-maker may have different utilities
associated with different populations. Hence, in addition to
separate probability distributions, the decision-maker has to
specify explicitly his/her utility function for each
subpopulation separately.

The utility structure dealt with in this paper is a
linear function of the criterion variable $T$, which seems to
be a realistic representation of the utilities actually
incurred in many decision making situations. In a recent
study, for instance, it was shown by van der Gaag (1987) that
many empirical utility structures could be approximated by
linear functions.

Monotonicity Conditions

As mentioned before, in a decision-theoretic approach,
optimal decision rules are found by optimizing expected
utility. However, the restriction to monotone rules in our
paper is only correct if there are no nonmonotone rules with
higher expected utility. It is here that the notion of an
essentially complete class of decision rules comes in handy.
An essentially complete class is defined as a class of
decision rules such that for any rule outside the class there
is a rule within the class that is at least as good (e.g.,
Ferguson, 1967, p. 55).

In case of separate elementary decisions, the
monotonicity conditions are known (Ferguson, 1967, sect. 6.1;
Karlin & Rubin, 1956). Two conditions have to be met. First,
the probability model relating observed test score $Z$ to true
score $T$ should have a monotone likelihood ratio (MLR), i.e.,
it is required that for any $t_2 < t_1$, the likelihood ratio
$f(z|t_1)/f(z|t_2)$ is a nondecreasing function of $z$. Second, the
utility function should be monotone; that is, the actions
should be ordered such that for each two adjacent actions the
utility functions have at most one intersection point. If
these conditions are met, a monotone solution is said to
exist. It should be noted that for the classification problem
these conditions do not hold without modifications (van der
Linden, 1987).

To guarantee that the monotone rule of the combined
decision problem belongs to an essentially complete class,
the following extra condition (Lehmann, 1959, sect. 3.3)
should hold:

(2) For any $t_2 < t_1$, the likelihood ratio
$k(x,y|t_1)/k(x,y|t_2)$ is a nondecreasing function in
each of its arguments; that is, for any $t_2 < t_1$ and
fixed values of $Y = y_0$ and $X = x_0$, the
likelihood ratios $k(x,y_0|t_1)/k(x,y_0|t_2)$ and
$k(x_0,y|t_1)/k(x_0,y|t_2)$ are nondecreasing functions of
$x$ and $y$, respectively.

It will be shown below that, in addition to the
conditions of MLR and monotone utility, condition (2) is
sufficient for a monotone solution to exist for the combined
decision problem. The condition of monotone utility is
elaborated in the next section.
Linear Utility Function for a
Selection-Mastery Decision



Generally speaking, a utility function evaluates the total
consequences of all possible decision outcomes. Formally, it
is a function $u_{ji}(t)$ that describes the utility incurred when
action $a_j$ ($j = 0, 1, 2$) is taken for the student from
subpopulation $i$ whose true score is $t$.

Mellenbergh and van der Linden (1981) and van der Linden
and Mellenbergh (1977) use a linear utility function for
determining optimal cutting scores on the separate decisions.
Here, their function is restated for the combined decision
problem as a linear function in $T$ for subpopulation $i$ (see
also Figure 3):



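(3)  $$u_{ji}(t) = \begin{cases} b_{0i}(t_c - t) + d_{0i} & \text{for } X < x_{ci} \\ b_{1i}(t - t_c) + d_{1i} & \text{for } X \geq x_{ci},\ Y < y_{ci} \\ b_{2i}(t - t_c) + d_{2i} & \text{for } X \geq x_{ci},\ Y \geq y_{ci}, \end{cases}$$

where $b_{0i}, b_{2i} > 0$.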

For each action $a_j$ ($j = 0, 1, 2$), this function consists
of a constant term and a term proportional to the difference
between the criterion performance $t$ of a student and the
minimum level of satisfactory criterion performance $t_c$. The
parameters $d_{0i}$, $d_{1i}$, and $d_{2i}$ can represent, for example, the
costs of testing or the cost of following an instructional
module. The condition $b_{0i}, b_{2i} > 0$ is equivalent to the
statement that for the rejected students and the accepted
students who passed the mastery test, utility is a strictly
decreasing and increasing function of $t$, respectively.
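
As a concrete reading of Formula 3, the following sketch evaluates the linear utility for one subpopulation. The function name `linear_utility` and the tuple layout of the parameters are illustrative assumptions, not part of the paper.

```python
def linear_utility(action, t, t_c, b, d):
    """Linear utility of Formula 3 for a single subpopulation (sketch).

    action : 0 (reject), 1 (retain) or 2 (advance)
    t      : true criterion score of the student
    t_c    : cut-off score on the criterion
    b, d   : tuples (b0, b1, b2) and (d0, d1, d2) of slope and
             constant utility parameters (hypothetical layout)
    """
    b0, b1, b2 = b
    d0, d1, d2 = d
    if action == 0:
        # Rejection: utility decreases in t because b0 > 0.
        return b0 * (t_c - t) + d0
    if action == 1:
        # Retention: the sign of b1 is not fixed in advance.
        return b1 * (t - t_c) + d1
    if action == 2:
        # Advancement: utility increases in t because b2 > 0.
        return b2 * (t - t_c) + d2
    raise ValueError("action must be 0, 1 or 2")
```

For example, with `t_c = 55`, slopes `(1.0, 0.5, 2.0)` and no constant utilities, rejecting a student with true score 50 yields utility 5.0, while advancing a student with true score 60 yields 10.0.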

It should be noticed that it cannot be said beforehand
whether the utility associated with action $a_1$, i.e., $u_{1i}(t)$,
is increasing or decreasing, because the utility of the
combined decision depends on the utilities associated with
the selection as well as the mastery decision. Depending on
whether the influence of the utility associated with the
acceptance or with the fail decision is the most important,
$u_{1i}(t)$ is an increasing or decreasing function of $t$,
respectively. Figure 3 displays an example of a combined
linear utility function for $b_{1i} > 0$.



Insert Figure 3 about here 



In the Introduction, it was remarked that one of the
main advantages of a simultaneous approach was that more
realistic utility structures could be used. Formula 3 nicely
demonstrates how a utility function defined on the ultimate
criterion of the ISS ("master" or "nonmaster") can be
properly brought into a previous decision (selection
decision).

Gross and Su (1975) pointed out that 'fair' selection is
a question of utilities. Whether a selection procedure is
believed to be fair to the various subpopulations which can
be distinguished depends on the utilities of those involved
in the selection process. From this point of view, the linear
utility model can be used to allow for the fact that the
students might belong to a disadvantaged or advantaged
subpopulation by choosing separate parameter values for the
subpopulations involved. Suppose, for example, that
subpopulation $h$ is considered more advantaged than $i$. In
choosing values of the parameters of the linear utility
function this can be taken into account by requiring that
incorrect decisions are considered worse for subpopulation $i$
than for $h$, while correct decisions are considered more
valuable for $i$ than for $h$. This amounts to choosing values of
the slope parameters such that $b_{0i} > b_{0h}$ and $b_{2i} > b_{2h}$ for
all $t$. Since $b_{1i} > 0$ implies that the influence of the
utility associated with the acceptance decision is the most
important, it will hold that $b_{1i} > b_{1h}$ for $b_{1i}, b_{1h} > 0$.
Following the same line of reasoning, it is required that
$b_{1i} < b_{1h}$ if $b_{1i}, b_{1h} < 0$.

The possible actions are supposed to be ordered as $a_0$,
$a_1$, and $a_2$. Using the fact that, as can be seen from Figure
3, the difference between the utilities changes sign precisely
once, the condition of monotone utility for the utility
function defined by Formula 3 can be expressed as

(4)  $$u_{1i}(t) - u_{0i}(t) = (b_{1i} + b_{0i})(t - t_c) + d_{1i} - d_{0i} \begin{cases} > 0 & \text{for } t > t_{10,i} \\ < 0 & \text{for } t < t_{10,i}, \end{cases}$$

(5)  $$u_{2i}(t) - u_{1i}(t) = (b_{2i} - b_{1i})(t - t_c) + d_{2i} - d_{1i} \begin{cases} > 0 & \text{for } t > t_{12,i} \\ < 0 & \text{for } t < t_{12,i}, \end{cases}$$



where $t_{10,i}$ and $t_{12,i}$ ($t_{10,i} \leq t_{12,i}$) denote the $T$
coordinates of the intersection of utility line $u_{1i}(t)$ with
$u_{0i}(t)$ and $u_{2i}(t)$, respectively. Furthermore, it is assumed
that the functions $u_{1i}(t) - u_{0i}(t)$ and $u_{2i}(t) - u_{1i}(t)$ are
strictly increasing functions of $t$, implying that the slope
parameters $(b_{1i} + b_{0i}), (b_{2i} - b_{1i}) > 0$. Using the fact that
$b_{0i}, b_{2i} > 0$, this means that the following condition should
hold for the utility parameter $b_{1i}$:

(6)  $$b_{2i} > b_{1i} \ \text{ if } b_{1i} > 0, \qquad b_{0i} > -b_{1i} \ \text{ if } b_{1i} < 0.$$
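
The slope conditions and the intersection points in (4) to (6) can be checked numerically. The sketch below sets the two utility differences in (4) and (5) equal to zero and solves for $t$; the function name and the choice to return `None` when the slope conditions fail are illustrative assumptions.

```python
def monotone_utility_points(b, d, t_c):
    """Check the slope conditions behind Formula 6 and return the
    intersection points (t_10, t_12) of Formulas 4 and 5 (sketch).

    b, d : tuples (b0, b1, b2) and (d0, d1, d2) for one subpopulation
    Returns None when b0, b2 > 0, b1 + b0 > 0 or b2 - b1 > 0 fails.
    """
    b0, b1, b2 = b
    d0, d1, d2 = d
    if not (b0 > 0 and b2 > 0 and (b1 + b0) > 0 and (b2 - b1) > 0):
        return None
    # Zero of (b1 + b0)(t - t_c) + d1 - d0, the difference in (4).
    t_10 = t_c + (d0 - d1) / (b1 + b0)
    # Zero of (b2 - b1)(t - t_c) + d2 - d1, the difference in (5).
    t_12 = t_c + (d1 - d2) / (b2 - b1)
    return t_10, t_12
```

The caller still has to verify the ordering $t_{10,i} \leq t_{12,i}$ assumed in the text, since it depends on the constant terms as well as on the slopes.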
Optimal Cutting Scores for Quota-Free Selection

In this section, optimal cutting scores are derived for the
combined decision problem in case of quota-free selection.
That is, we are looking for pairs of cutting scores $(x_{ci}, y_{ci})$
such that the overall expected utility is a maximum.

Overall Expected Utility

In maximizing overall expected utility, first the expected
utility of a random student from the $i$th subpopulation will
be calculated, which, as monotone solutions are looked for,
can be written as



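(7)  $$E[u_i(T \mid x_{ci}, y_{ci})] = \int_{-\infty}^{x_{ci}} \int_{-\infty}^{\infty} [b_{0i}(t_c - t) + d_{0i}]\, w_i(x,t)\, dt\, dx$$
$$\qquad + \int_{x_{ci}}^{\infty} \int_{-\infty}^{y_{ci}} \int_{-\infty}^{\infty} [b_{1i}(t - t_c) + d_{1i}]\, n_i(x,y,t)\, dt\, dy\, dx$$
$$\qquad + \int_{x_{ci}}^{\infty} \int_{y_{ci}}^{\infty} \int_{-\infty}^{\infty} [b_{2i}(t - t_c) + d_{2i}]\, n_i(x,y,t)\, dt\, dy\, dx,$$

where $w_i(x,t)$ is the joint probability function of $X$ and $T$ in
subpopulation $i$. Let $E_i(T|x)$, $q_i(x)$, $k_i(x,y)$, and $E_i(T|x,y)$
denote the regression function of the criterion variable $T$ on
$X$, the probability function of $X$, the joint probability
function of $X$ and $Y$, and the regression function of the
criterion variable $T$ on $X$ and $Y$ in subpopulation $i$,
respectively. Then (7) can be written as

(8)  $$E[u_i(T \mid x_{ci}, y_{ci})] = \int_{-\infty}^{\infty} \{b_{0i}[t_c - E_i(T|x)] + d_{0i}\}\, q_i(x)\, dx$$
$$\qquad + \int_{x_{ci}}^{\infty} \{[b_{0i} + b_{1i}][E_i(T|x) - t_c] + d_{1i} - d_{0i}\}\, q_i(x)\, dx$$
$$\qquad + \int_{x_{ci}}^{\infty} \int_{y_{ci}}^{\infty} \{[b_{2i} - b_{1i}][E_i(T|x,y) - t_c] + d_{2i} - d_{1i}\}\, k_i(x,y)\, dy\, dx.$$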



Now, the decision procedure is viewed as a series of
separate decisions, each of which involves one random
student, and it follows that the overall expected utility is
a weighted average of the expected utilities for the
individual populations. Thus, overall expected utility of the
combined decision problem is:

(9)  $$E[u(T \mid x_{ci}, y_{ci})] = \sum_{i=1}^{g} p_i\, E[u_i(T \mid x_{ci}, y_{ci})],$$

where $p_i$, with $\sum_{i=1}^{g} p_i = 1$, is the proportion of students from
subpopulation $i$ in the total population of students.

In quota-free selection there is no restriction as to
the number of students that can be accepted for the
treatment. Therefore, Formula 9 is maximized if the expected
utility of a random student is maximized. This is done by
maximizing the terms of Formula 9 for each subpopulation
separately. The maximum of $E[u_i(T \mid x_{ci}, y_{ci})]$ now depends only
on the second and third term in the right-hand side of (8),
because the first term is independent of $x_{ci}$ and $y_{ci}$. Using a
result from decision theory (see, e.g., Chuang, Chen, & Novick,
1981) stating that for any prior distribution of $t$, $E[u(T)|z]$
is a nondecreasing function of $z$ if $f(z|t)$ has MLR and $u(t)$ is
a nondecreasing function of $t$, and assuming monotonicity
condition (2), it follows from (5) that
$E_i[u_{2i}(T) - u_{1i}(T) \mid x, y] = [b_{2i} - b_{1i}][E_i(T|x,y) - t_c] + d_{2i} - d_{1i}$
is a nondecreasing function in each of its arguments. Since
$(b_{2i} - b_{1i}) > 0$, this implies that
$\frac{\partial}{\partial x} E_i(T|x,y) \geq 0$ and $\frac{\partial}{\partial y} E_i(T|x,y) \geq 0$.
Similarly, using (4) instead of (5), it follows that
$E_i[u_{1i}(T) - u_{0i}(T) \mid x] = [b_{1i} + b_{0i}][E_i(T|x) - t_c] + d_{1i} - d_{0i}$
is a nondecreasing function of $x$, implying that, since
$(b_{1i} + b_{0i}) > 0$, $\frac{d}{dx} E_i(T|x) \geq 0$. Using
$q_i(x), k_i(x,y) \geq 0$, it follows now that the sign of the sum
of the second and third term changes only once from negative
to positive, and, therefore, $E[u_i(T \mid x_{ci}, y_{ci})]$ will reach its
maximum for one pair of cutting scores $(x_{ci}, y_{ci})$.

Maximizing Expected Utility for a Random Student

Necessary conditions for the optimal values of the cutting
scores, say $x'_{ci}$ and $y'_{ci}$, optimizing the expected utility
for a random student from subpopulation $i$, $E[u_i(T \mid x_{ci}, y_{ci})]$,
can be obtained by differentiating $E[u_i(T \mid x_{ci}, y_{ci})]$ with
respect to $x_{ci}$ and $y_{ci}$, setting the resulting expressions
equal to zero, and solving simultaneously for $x_{ci}$ and $y_{ci}$.

Using the property that for any bivariate distribution
$f(x,y)$ and any fixed lower limit $x_c$, it holds that

$$\frac{\partial}{\partial s} \int_{x_c}^{\infty} \int_{-\infty}^{s} f(x,y)\, dy\, dx = -\frac{\partial}{\partial s} \int_{x_c}^{\infty} \int_{s}^{\infty} f(x,y)\, dy\, dx = \int_{x_c}^{\infty} f(x,s)\, dx,$$

the derivative of $E[u_i(T \mid x_{ci}, y_{ci})]$ with respect to $y_{ci}$
results in

(10)  $$\frac{\partial}{\partial y_{ci}} E[u_i(T \mid x_{ci}, y_{ci})] = s_i(y_{ci}) \int_{x_{ci}}^{\infty} \{[b_{1i} - b_{2i}][E_i(T|x,y_{ci}) - t_c] + d_{1i} - d_{2i}\}\, z_i(x|y_{ci})\, dx = 0,$$

where $z_i(x|y_{ci})$ and $s_i(y)$ denote the posterior probability
function of $X$ given $Y = y_{ci}$ and the marginal probability
function of $Y$ in subpopulation $i$, respectively. Since $s_i(y) \geq 0$
(the possibility of $s_i(y) = 0$ will be ignored), it follows
that (10) can be replaced by

(11)  $$\int_{x_{ci}}^{\infty} \{[b_{2i} - b_{1i}][E_i(T|x,y_{ci}) - t_c] + d_{2i} - d_{1i}\}\, z_i(x|y_{ci})\, dx = 0.$$

Similarly, differentiating $E[u_i(T \mid x_{ci}, y_{ci})]$ with respect to
$x_{ci}$, using $q_i(x) \geq 0$ (the possibility of $q_i(x) = 0$ will also
be ignored), results in

(12)  $$[b_{0i} + b_{1i}][E_i(T|x_{ci}) - t_c] + d_{1i} - d_{0i} + \int_{y_{ci}}^{\infty} \{[b_{2i} - b_{1i}][E_i(T|x_{ci},y) - t_c] + d_{2i} - d_{1i}\}\, m_i(y|x_{ci})\, dy = 0,$$

with $m_i(y|x_{ci})$ being the posterior probability function of $Y$
given $X = x_{ci}$. Now, solving the system of Equations 11 and 12
for $x_{ci}$ and $y_{ci}$, one obtains the optimal cutting scores $x'_{ci}$
and $y'_{ci}$.

Linear Regression 

For given regression functions and probability density
functions, the optimal quota-free decision strategy is
represented by the system of Equations 11 and 12. If the
monotonicity conditions are not strict, or if $s_i(y)$ or $q_i(x)$
is not positive in the neighborhood of the solution,
the optimal decision strategy may not be unique. Throughout
this paper it will be assumed that the monotonicity conditions
are strict and that these densities are positive near the
solution.

Since the relations between the test scores and the criterion
(true score) in the regression functions are not directly
observable, psychometric models are needed to estimate these
relations. Possible psychometric models are the linear
regression functions $\theta_i + \eta_i x$ and $\alpha_i + \beta_i x + \gamma_i y$ for
$E_i(T|x)$ and $E_i(T|x,y)$, respectively. Since the probability
functions of $T$ given $X = x$, and of $T$ given $X = x$ and $Y = y$,
in subpopulation $i$ are normal (see, e.g., Johnson & Kotz,
1970), it follows that they belong to the exponential family,
and, hence, they do possess the property of MLR and MLR in
each of its arguments, respectively (see, e.g., Chuang, Chen,
& Novick, 1981). Together with the properties
$\frac{d}{dx} E_i(T|x) = \eta_i$, $\frac{\partial}{\partial x} E_i(T|x,y) = \beta_i$, and
$\frac{\partial}{\partial y} E_i(T|x,y) = \gamma_i \geq 0$ (see, e.g., Lord & Novick, 1968),
it then follows that the monotonicity conditions are
fulfilled. Assuming linear regression, it can be shown from
classical test theory that the linear regression of $T$ on $X$ is
given by



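(13)  $$E_i(T|x) = E_i(Y|x) = \mu_{Y,i} + \rho_i (\sigma_{Y,i}/\sigma_{X,i})(x - \mu_{X,i}),$$

with $\mu_{Y,i}$, $\mu_{X,i}$, $\rho_i$, $\sigma_{Y,i}$, and $\sigma_{X,i}$ being the population means of
$Y_i$ and $X_i$, the population correlation between $X_i$ and $Y_i$,
and the population standard deviations of $Y_i$ and $X_i$,
respectively. From (13), it follows that

(14)  $$\eta_i = \rho_i (\sigma_{Y,i}/\sigma_{X,i}), \qquad \theta_i = \mu_{Y,i} - \eta_i \mu_{X,i}.$$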

Furthermore, using results from classical test theory,
it can be shown that the linear regression of $T$ on $X$ and $Y$
can be written as

(15)  $$E_i(T|x,y) = \mu_{Y,i} + (\sigma_{Y,i}/\sigma_{X,i})\left[\frac{\rho_i - \rho_{YY',i}\,\rho_i}{1 - \rho_i^2}\right](x - \mu_{X,i}) + \left[\frac{\rho_{YY',i} - \rho_i^2}{1 - \rho_i^2}\right](y - \mu_{Y,i}),$$

$\rho_{YY',i}$ being the reliability coefficient of $Y_i$. From (15), it
follows that

(16)  $$\beta_i = (\sigma_{Y,i}/\sigma_{X,i})\,\frac{\rho_i - \rho_{YY',i}\,\rho_i}{1 - \rho_i^2}, \qquad \gamma_i = \frac{\rho_{YY',i} - \rho_i^2}{1 - \rho_i^2}, \qquad \alpha_i = -\mu_{X,i}\,\beta_i + \mu_{Y,i}(1 - \gamma_i).$$

All quantities appearing in (14) and (16) can be estimated
in a straightforward way; thus, estimates of the linear
regression functions can be calculated.
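
The regression coefficients can also be obtained numerically. The sketch below is not a transcription of Formula 16: it solves the two normal equations for the regression of $T$ on $X$ and $Y$ under the usual classical test theory assumptions, which are stated here as assumptions, namely Cov(X, T) = Cov(X, Y) and Cov(Y, T) = Var(T) = reliability times Var(Y). The function and argument names are hypothetical.

```python
def regression_coefficients(mu_x, mu_y, sd_x, sd_y, rho, rel_y):
    """Coefficients of E(T|x) = theta + eta*x and
    E(T|x,y) = alpha + beta*x + gamma*y for one subpopulation (sketch).

    rho   : correlation between observed scores X and Y
    rel_y : reliability coefficient of Y
    """
    # Regression of T on X alone (Formulas 13 and 14).
    eta = rho * sd_y / sd_x
    theta = mu_y - eta * mu_x

    # Covariances under the classical test theory assumptions above.
    c_xx = sd_x ** 2
    c_yy = sd_y ** 2
    c_xy = rho * sd_x * sd_y
    c_xt = c_xy                # assumption: Cov(X, T) = Cov(X, Y)
    c_yt = rel_y * c_yy        # assumption: Cov(Y, T) = Var(T)

    # Solve the 2x2 normal equations by Cramer's rule.
    det = c_xx * c_yy - c_xy ** 2
    beta = (c_yy * c_xt - c_xy * c_yt) / det
    gamma = (c_xx * c_yt - c_xy * c_xt) / det
    alpha = mu_y - beta * mu_x - gamma * mu_y
    return theta, eta, alpha, beta, gamma
```

A useful sanity check is that averaging $E(T|x,y)$ over $Y$ given $x$ must reproduce $E(T|x)$, so that `beta + gamma * eta` equals `eta`.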

Iterative Solution in Case of the Bivariate Normal Model

In order to solve the system of Equations 11 and 12 for $x_{ci}$
and $y_{ci}$, the decision-maker must specify the joint
probability function of $X$ and $Y$. It is assumed that the
variables $X$ and $Y$ have possibly different bivariate normal
distributions in each subpopulation. Assuming that the $X$ and
$Y$ scores are in their standardized form, this can be written
as

(17)  $$k_i(x_N, y_N) = \left[2\pi\sqrt{1 - \rho_i^2}\right]^{-1} \exp\left[-\frac{x_N^2 - 2\rho_i x_N y_N + y_N^2}{2(1 - \rho_i^2)}\right],$$

where $x_N$ and $y_N$ denote the standardized scores $(x - \mu_X)/\sigma_X$ and
$(y - \mu_Y)/\sigma_Y$ of $X$ and $Y$, respectively. For the standardized
bivariate normal distribution in (17), the conditional
distribution of $X_N$ given $Y_N = y_N$ is normal with expected
value $\rho_i y_N$ and variance $(1 - \rho_i^2)$. Likewise, the distribution
of $Y_N$ given $X_N = x_N$ is normal with expected value $\rho_i x_N$ and
variance $(1 - \rho_i^2)$.

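Substituting $\alpha_i + \beta_i x + \gamma_i y_{ci}$ and $N(\rho_i y_{N,ci},\, 1 - \rho_i^2)$ into
Equation 11 for $E_i(T|x,y_{ci})$ and $z_i(x_N|y_{N,ci})$, respectively,
and using the property that the primitive function of $x e^{-x^2/2}$
is equal to $-e^{-x^2/2}$, it follows that Equation 11 will take the
form

(18)  $$f(x_{ci}, y_{ci}) = \{(b_{2i} - b_{1i})(\alpha_i + \beta_i \mu_{X,i} + \gamma_i y_{ci} + \beta_i \sigma_{X,i} \rho_i y_{N,ci} - t_c) + d_{2i} - d_{1i}\}\left(1 - \Phi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\right)$$
$$\qquad + (b_{2i} - b_{1i})\,\beta_i \sigma_{X,i} \sqrt{1 - \rho_i^2}\; \varphi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] = 0,$$

where $\Phi[\cdot]$ and $\varphi[\cdot]$ denote the standard normal distribution
function and the standard normal density, respectively.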
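
The step from Equation 11 to Equation 18 rests on the partial expectation of a normal variable above a cut point: for a variable with mean $m$ and standard deviation $s$, the integral of $x$ times its density over $[c, \infty)$ equals $m(1 - \Phi(a)) + s\,\varphi(a)$ with $a = (c - m)/s$. The sketch below checks this identity against a plain trapezoid rule; all function names are illustrative, and the integration settings are arbitrary choices.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def partial_mean_closed(c, m, s):
    """Closed form of the integral of x * N(x; m, s^2) over [c, infinity)."""
    a = (c - m) / s
    return m * (1.0 - Phi(a)) + s * phi(a)


def partial_mean_numeric(c, m, s, upper_sds=10.0, n=20000):
    """Trapezoid approximation of the same integral, truncated at
    m + upper_sds * s (the neglected tail is negligible)."""
    hi = m + upper_sds * s
    if c >= hi:
        return 0.0
    h = (hi - c) / n
    total = 0.0
    for k in range(n + 1):
        x = c + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * x * phi((x - m) / s) / s
    return total * h
```

With $m = \rho_i y_{N,ci}$ and $s = \sqrt{1 - \rho_i^2}$, the closed form supplies both the $(1 - \Phi)$ term and the $\varphi$ term of Equation 18.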

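Similarly, substituting $\theta_i + \eta_i x_{ci}$, $\alpha_i + \beta_i x_{ci} + \gamma_i y$,
and $N(\rho_i x_{N,ci},\, 1 - \rho_i^2)$ into Equation 12 for $E_i(T|x_{ci})$,
$E_i(T|x_{ci},y)$, and $m_i(y_N|x_{N,ci})$, respectively, results in

(19)  $$g(x_{ci}, y_{ci}) = (b_{0i} + b_{1i})(\theta_i + \eta_i x_{ci} - t_c) + d_{1i} - d_{0i}$$
$$\qquad + \{(b_{2i} - b_{1i})(\alpha_i + \beta_i x_{ci} + \gamma_i \mu_{Y,i} + \gamma_i \rho_i \sigma_{Y,i} x_{N,ci} - t_c) + d_{2i} - d_{1i}\}\left(1 - \Phi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\right)$$
$$\qquad + (b_{2i} - b_{1i})\,\gamma_i \sigma_{Y,i} \sqrt{1 - \rho_i^2}\; \varphi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] = 0.$$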



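The system of Equations 18 and 19 cannot be solved
analytically for $x_{ci}$ and $y_{ci}$, but it can be solved
iteratively using Newton's method for systems of nonlinear
equations (see, e.g., Ortega & Rheinboldt, 1970). Updated
estimates $x'_{ci,j+1}$ and $y'_{ci,j+1}$ after iteration $j+1$ are
obtained using the following formulas:

(20)  $$x'_{ci,j+1} = x'_{ci,j} - \left[\left(f \frac{\partial g}{\partial y_{ci}} - g \frac{\partial f}{\partial y_{ci}}\right) \Big/ J(f,g)\right],$$
$$y'_{ci,j+1} = y'_{ci,j} - \left[\left(g \frac{\partial f}{\partial x_{ci}} - f \frac{\partial g}{\partial x_{ci}}\right) \Big/ J(f,g)\right],$$

where $J(f,g) = \frac{\partial f}{\partial x_{ci}} \frac{\partial g}{\partial y_{ci}} - \frac{\partial g}{\partial x_{ci}} \frac{\partial f}{\partial y_{ci}}$ represents the
Jacobian of the functions $f(x_{ci}, y_{ci})$ and $g(x_{ci}, y_{ci})$, and all
quantities are evaluated at $(x'_{ci,j}, y'_{ci,j})$. It is
recommended that the cut-off score $t_c$ on the true score scale
$T$ is used as a first approximation to $x'_{ci}$ and $y'_{ci}$.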
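
The update formulas in (20) can be sketched as a small generic routine. This is not the NEWTON program mentioned later in the paper; the function name, the stopping rule, and the default tolerances are assumptions made for illustration. The caller supplies $f$, $g$, and their four partial derivatives as ordinary functions of two arguments.

```python
def newton_2d(f, g, fx, fy, gx, gy, x0, y0, tol=1e-10, max_iter=50):
    """Newton's method for the system f(x, y) = 0, g(x, y) = 0,
    written to mirror the update formulas in (20) (illustrative sketch).

    fx, fy, gx, gy : partial derivatives of f and g
    x0, y0         : starting values
    """
    x, y = x0, y0
    for _ in range(max_iter):
        f_val, g_val = f(x, y), g(x, y)
        f_x, f_y = fx(x, y), fy(x, y)
        g_x, g_y = gx(x, y), gy(x, y)
        # Jacobian determinant J(f, g).
        jac = f_x * g_y - g_x * f_y
        if jac == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        step_x = (f_val * g_y - g_val * f_y) / jac
        step_y = (g_val * f_x - f_val * g_x) / jac
        x, y = x - step_x, y - step_y
        # Stop once the update is negligibly small.
        if abs(step_x) + abs(step_y) < tol:
            return x, y
    raise RuntimeError("Newton iteration did not converge")
```

As a toy check unrelated to the paper's data, the system $x^2 + y^2 - 4 = 0$, $x - y = 0$ started at $(1, 2)$ converges to $x = y = \sqrt{2}$ within a few iterations.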

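In order to solve the nonlinear system of Equations 18
and 19 via the iterative procedure given in (20), the partial
derivatives of $f(x_{ci}, y_{ci})$ and $g(x_{ci}, y_{ci})$ are needed. They are
given as

(21)  $$\frac{\partial}{\partial x_{ci}} f(x_{ci}, y_{ci}) = -\left[\sqrt{1 - \rho_i^2}\right]^{-1} \varphi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] \{(b_{2i} - b_{1i})(\alpha_i + \beta_i x_{ci} + \gamma_i y_{ci} - t_c) + d_{2i} - d_{1i}\} / \sigma_{X,i},$$

(22)  $$\frac{\partial}{\partial y_{ci}} f(x_{ci}, y_{ci}) = \Big\{(b_{2i} - b_{1i})\Big[(\gamma_i \sigma_{Y,i} + \beta_i \rho_i \sigma_{X,i})\left(1 - \Phi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\right)$$
$$\qquad + \frac{\rho_i}{\sqrt{1 - \rho_i^2}}\, \varphi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right](\alpha_i + \beta_i x_{ci} + \gamma_i y_{ci} - t_c)\Big]$$
$$\qquad + (d_{2i} - d_{1i})\frac{\rho_i}{\sqrt{1 - \rho_i^2}}\, \varphi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\Big\} / \sigma_{Y,i},$$

(23)  $$\frac{\partial}{\partial x_{ci}} g(x_{ci}, y_{ci}) = (b_{0i} + b_{1i})\,\eta_i + \Big\{(b_{2i} - b_{1i})\Big[(\beta_i \sigma_{X,i} + \gamma_i \rho_i \sigma_{Y,i})\left(1 - \Phi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\right)$$
$$\qquad + \frac{\rho_i}{\sqrt{1 - \rho_i^2}}\, \varphi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right](\alpha_i + \beta_i x_{ci} + \gamma_i y_{ci} - t_c)\Big]$$
$$\qquad + (d_{2i} - d_{1i})\frac{\rho_i}{\sqrt{1 - \rho_i^2}}\, \varphi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right]\Big\} / \sigma_{X,i},$$

(24)  $$\frac{\partial}{\partial y_{ci}} g(x_{ci}, y_{ci}) = -\left[\sqrt{1 - \rho_i^2}\right]^{-1} \varphi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] \{(b_{2i} - b_{1i})(\alpha_i + \beta_i x_{ci} + \gamma_i y_{ci} - t_c) + d_{2i} - d_{1i}\} / \sigma_{Y,i}.$$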
An interesting special case of the combined linear
utility function arises when $d_{0i} = d_{1i} = d_{2i}$. In that case,
all utility parameters $d_{ji}$ ($j = 0, 1, 2$) vanish from Equations
18 and 19. In other words, if the amount of constant utility,
$d_{ji}$, for each action is equal, then there is no need to
choose values for $d_{ji}$ in determining the optimal cutting
scores $x'_{ci}$ and $y'_{ci}$.

It can be shown (Lord & Novick, 1968, sect. 17.2) that
the standard normal distributions appearing in (18)-(24) are
almost interchangeable with logistic functions for a scale
parameter equal to 1.7. The logistic model will be preferred
in the iterative procedure because it is easier to work with
mathematically than the standard normal model. Using this
approximation, we may rewrite the standard normal
distributions as follows:

$$\Phi\left[\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] \approx \left[1 + \exp\left\{-1.7\,\frac{x_{N,ci} - \rho_i y_{N,ci}}{\sqrt{1 - \rho_i^2}}\right\}\right]^{-1},$$

$$\Phi\left[\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right] \approx \left[1 + \exp\left\{-1.7\,\frac{y_{N,ci} - \rho_i x_{N,ci}}{\sqrt{1 - \rho_i^2}}\right\}\right]^{-1}.$$

The iterative procedure is implemented in a computer program
called NEWTON.
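
How close the logistic function with scale parameter 1.7 comes to the standard normal distribution function can be checked on a grid. The sketch below is an illustration only; the function names, the grid range, and the grid density are arbitrary choices.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z, scale=1.7):
    """Logistic distribution function with the given scale parameter."""
    return 1.0 / (1.0 + math.exp(-scale * z))


def max_abs_gap(lo=-6.0, hi=6.0, steps=2400):
    """Largest absolute difference between the two functions on a grid."""
    gap = 0.0
    for k in range(steps + 1):
        z = lo + k * (hi - lo) / steps
        gap = max(gap, abs(normal_cdf(z) - logistic_cdf(z)))
    return gap
```

On this grid the largest discrepancy stays below 0.01, which is the sense in which the two functions are "almost interchangeable".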
Special Solutions

The optimal solutions for the separate mastery and selection
decisions can both easily be derived from Equations 11 and 12
by imposing certain restrictions on $x_{ci}$ and $y_{ci}$,
respectively.

First, putting $x_{ci} = -\infty$ in Equation 11, that is,
accepting all students, and using $\int_{-\infty}^{\infty} z_i(x|y_{ci})\, dx = 1$,
Equation 11 will take the form

(25)  $$[b_{2i} - b_{1i}][E_i(T|y_{ci}) - t_c] + d_{2i} - d_{1i} = 0.$$

Putting both utility lines $u_{1i}(t)$ and $u_{2i}(t)$ in Formula 3
equal to each other, it appears that the $t$ coordinate of the
intersection, $t_{12,i}$, is equal to $t_c + (d_{1i} - d_{2i})/(b_{2i} - b_{1i})$,
which implies (25) can be replaced by $E_i(T|y_{ci}) = t_{12,i}$. This
solution yields the same optimal cutting score $y'_{ci}$ as the
one given by van der Linden and Mellenbergh (1977) for the
separate mastery decision. Analogous to the combined decision
problem, a psychometric model is needed to specify the
regression function $E_i(T|y_{ci})$. For this purpose, the
classical test model with linear regression (Lord & Novick,
1968, p. 55) will be assumed, which is known as Kelley's
regression line:

(26)  $$E_i(T|y_{ci}) = \rho_{YY',i}\, y_{ci} + (1 - \rho_{YY',i})\, \mu_{Y,i}.$$

Substituting (26) into (25) gives
(27)  $$y'_{ci} = \mu_{Y,i} + \{t_c - \mu_{Y,i} + (d_{1i} - d_{2i})/(b_{2i} - b_{1i})\} / \rho_{YY',i}.$$

Analogous to the derivation of the optimal separate
mastery decision from (11), the optimal separate selection
decision can be derived from (12) by putting $y_{ci} = -\infty$, that
is, advancing all accepted students. Doing so, and using
Formula 3, it follows that

(28)  $$E_i(T|x_{ci}) = t_{02,i} = t_c + (d_{0i} - d_{2i})/(b_{0i} + b_{2i}),$$

where $t_{02,i}$ denotes the $t$ coordinate of the intersection of
utility line $u_{0i}(t)$ with $u_{2i}(t)$. Also, this optimal solution
is the same as the one reached by Mellenbergh and van der
Linden (1981) for the separate selection decision. Adopting
the linear regression function from classical test theory
again, it follows from (28) that the optimal cutting score
$x'_{ci}$ can be expressed in closed form as

(29)  $$x'_{ci} = \mu_{X,i} + \{t_c - \mu_{X,i} + (d_{0i} - d_{2i})/(b_{0i} + b_{2i})\} / \rho_{XX',i},$$

where $\rho_{XX',i}$ denotes the reliability coefficient of $X_i$.

An interesting case arises when $d_{1i} = d_{2i}$ in Equation
27. Whenever this occurs, both utility lines $u_{1i}(t)$ and
$u_{2i}(t)$ intersect at $t_c$, and thus Equation 27 takes the form

(30)  $y'_{ci} = \mu_{Y,i} + (t_c-\mu_{Y,i})/\rho_{YY',i}.$

In other words, if the amounts of constant utility associated
with the actions retaining and advancing a student in the
separate mastery decision are equal or there are no constant
utilities at all, then there is no need to assess the
parameters $b_{1i}$ and $b_{2i}$. In the numerical example, this
situation will be further elaborated.

Similarly, all utility function parameters vanish from
(29) whenever $d_{0i} = d_{2i}$; thus, (29) can be further simplified
to

(31)  $x'_{ci} = \mu_{X,i} + (t_c-\mu_{X,i})/\rho_{XX',i}.$
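The closed-form solutions of Equations 27 and 29 are simple enough to evaluate directly. The Python sketch below does so; the function names and all numerical inputs are hypothetical and serve only to illustrate the formulas. When the constant utilities are equal, the results reduce to Equations 30 and 31.

```python
def kelley_regression(score, mean, reliability):
    """Kelley's regression line (Equation 26): expected true score given a test score."""
    return reliability * score + (1.0 - reliability) * mean

def separate_mastery_cutoff(t_c, mean_y, rel_y, b1, b2, d1, d2):
    """Equation 27: cutting score on Y for the separate mastery decision."""
    t12 = t_c + (d1 - d2) / (b2 - b1)  # intersection of u_1i(t) and u_2i(t)
    return mean_y + (t12 - mean_y) / rel_y

def separate_selection_cutoff(t_c, mean_x, rel_x, b0, b2, d0, d2):
    """Equation 29: cutting score on X for the separate selection decision."""
    t02 = t_c + (d0 - d2) / (b0 + b2)  # intersection of u_0i(t) and u_2i(t)
    return mean_x + (t02 - mean_x) / rel_x

if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formulas.
    y_cut = separate_mastery_cutoff(55.0, 60.0, 0.80, b1=1.0, b2=3.0, d1=2.0, d2=1.0)
    x_cut = separate_selection_cutoff(55.0, 50.0, 0.75, b0=2.0, b2=3.0, d0=1.0, d2=0.0)
    print("mastery cutting score:", round(y_cut, 3))
    print("selection cutting score:", round(x_cut, 3))
    # Inserting the cutting score into Kelley's line recovers the intersection t_12.
    print("E(T | y_cut):", round(kelley_regression(y_cut, 60.0, 0.80), 3))
```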

Optimal Cutting Scores 
for Quota-Restricted Selection 

In quota-restricted selection only a fixed number of students
can be accepted for the instructional program. The selection
constraint can be expressed as

(32)  $p_0 = \sum_{i=1}^{g} p_i[\mathrm{Prob}(X \geq x_{ci})] = \sum_{i=1}^{g} p_i[\int_{x_{ci}}^{\infty} q_i(x)\,dx],$

where $0 < p_0 < \sum_{i=1}^{g} p_i = 1$ represents the fixed proportion of
all students that can be accepted.

The values of $x'_{ci}$ and $y'_{ci}$ optimizing the overall
expected utility of the combined decision problem can now be
found by introducing the selection constraint into the
function to be optimized (Equation 9) through a Lagrange
multiplier $\lambda$:

(33)  $L(x_{ci}, y_{ci}, \lambda) = \sum_{i=1}^{g} p_i E[u_i(T \mid x_{ci}, y_{ci})] + \lambda\{p_0 - \sum_{i=1}^{g} p_i[\int_{x_{ci}}^{\infty} q_i(x)\,dx]\},$

where $\lambda$ is a constant.

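Differentiating $L(x_{ci}, y_{ci}, \lambda)$ with respect to $y_{ci}$ and
$x_{ci}$, setting the resulting expressions equal to zero, and
using $p_i > 0$, yields

(34)  $\frac{\partial}{\partial y_{ci}} E[u_i(T \mid x_{ci}, y_{ci})] = 0,$

(35)  $\frac{\partial}{\partial x_{ci}} E[u_i(T \mid x_{ci}, y_{ci})] + \lambda q_i(x_{ci}) = 0.$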

As can be noticed from Equation 10, the solution to Equation
34 is the same as the solution for the case of quota-free
selection given by Equation 18.

The first term in the left-hand side of Equation 35
represents the derivative of the expected utility of a random
student with respect to $x_{ci}$ in the unrestricted situation
(Equation 12). Substituting this partial derivative into
Equation 35, and using $q_i(x) \geq 0$, it follows that

(36)  $[b_{0i}+b_{1i}][E_i(T \mid x_{ci})-t_c]+d_{1i}-d_{0i}-\lambda +
\int_{y_{ci}}^{\infty} \{[b_{2i}-b_{1i}][E_i(T \mid x_{ci},y)-t_c]+d_{2i}-d_{1i}\}\, m_i(y \mid x_{ci})\,dy = 0.$

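Inserting the expressions for the linear regression functions
and $N(\rho_i x_{N,ci}, 1-\rho_i^2)$ for $m_i(y_N \mid x_{N,ci})$, and integrating
Equation 36, results in

(37)  $h(x_{ci}, y_{ci}) =
\{(b_{0i}+b_{1i})(\theta_i+\Gamma_i x_{ci}-t_c)+d_{1i}-d_{0i}-\lambda\} +
\{(b_{2i}-b_{1i})(\alpha_i+\beta_i x_{ci}+\gamma_i\mu_{Y,i}+\gamma_i\rho_i\sigma_{Y,i}x_{N,ci}-t_c)+d_{2i}-d_{1i}\}
(1-\Phi[(y_{N,ci}-\rho_i x_{N,ci})/\sqrt{1-\rho_i^2}]) +
(b_{2i}-b_{1i})\gamma_i\sigma_{Y,i}\sqrt{1-\rho_i^2}\,\varphi[(y_{N,ci}-\rho_i x_{N,ci})/\sqrt{1-\rho_i^2}] = 0.$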
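The integration that leads from Equation 36 to Equation 37 rests on a standard property of the normal density: for a normal variable with mean m and standard deviation s, the integral of y times the density over [c, ∞) equals m{1 − Φ(z)} + sφ(z), with z = (c − m)/s. The Python sketch below checks this identity numerically; the function names and the numbers are illustrative assumptions only.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def partial_mean_closed_form(c, m, s):
    """Closed form of the integral of y * N(y; m, s^2) over [c, infinity)."""
    z = (c - m) / s
    return m * (1.0 - Phi(z)) + s * phi(z)

def partial_mean_numeric(c, m, s, upper_sd=10.0, steps=100000):
    """Same integral by the midpoint rule, cut off upper_sd deviations above max(c, m)."""
    hi = max(c, m) + upper_sd * s
    width = (hi - c) / steps
    total = 0.0
    for k in range(steps):
        y = c + (k + 0.5) * width
        total += y * phi((y - m) / s) / s
    return total * width

if __name__ == "__main__":
    # Hypothetical cutting score, conditional mean, and conditional standard deviation.
    c, m, s = 58.0, 62.0, 6.0
    print("closed form:", partial_mean_closed_form(c, m, s))
    print("numerical  :", partial_mean_numeric(c, m, s))
```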

Since it has been assumed that the joint distribution of X
and T is a possibly different bivariate normal distribution
in each population, it follows that $q_i(x)$ is a normal
distribution with mean $\mu_{X,i}$ and variance $\sigma_{X,i}^2$ (see, e.g.,
Johnson & Kotz, 1972). Hence, the restriction of Equation 32
can be written as

(38)  $v(x_{c1}, x_{c2}, \ldots, x_{cg}) = \sum_{i=1}^{g} p_i\{1-\Phi(x_{N,ci})\} - p_0 = 0.$
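The left-hand side of Equation 38 can be evaluated for any set of candidate cutting scores, as in the following Python sketch. It assumes that the standardized cutting score is (cutting score − mean)/standard deviation within each subpopulation; the function name and the example values are hypothetical.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def quota_residual(cutoffs, means, sds, proportions, p0):
    """Left-hand side v(x_c1, ..., x_cg) of Equation 38."""
    accepted = 0.0
    for x_c, mu, sigma, p in zip(cutoffs, means, sds, proportions):
        x_n = (x_c - mu) / sigma          # standardized cutting score (assumed form)
        accepted += p * (1.0 - Phi(x_n))  # share of group i scoring at or above x_c
    return accepted - p0

if __name__ == "__main__":
    # Hypothetical two-group setting in which one third of all students may be accepted.
    v = quota_residual([57.0, 59.0], [50.0, 56.0], [11.0, 12.0], [0.6, 0.4], 1.0 / 3.0)
    print("v =", round(v, 4), "(a positive value means too many students are accepted)")
```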

Now, the solution for the quota-restricted selection model is
found by solving the system of Equations 18, 37, and 38 for
the (2g + 1) unknown parameters $x_{ci}$, $y_{ci}$, and $\lambda$. Note that
with quota-restricted selection, unlike quota-free selection,
the optimal cutting scores $x'_{ci}$ and $y'_{ci}$ (i = 1, ..., g) are
dependent upon each other.

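In order to apply Newton's iterative method to solve the
given system of nonlinear equations, the partial derivatives
are required again. From (37) and (19), it can easily be
verified that
$\frac{\partial}{\partial x_{ci}} h(x_{ci},y_{ci}) = \frac{\partial}{\partial x_{ci}} g(x_{ci},y_{ci})$,
$\frac{\partial}{\partial y_{ci}} h(x_{ci},y_{ci}) = \frac{\partial}{\partial y_{ci}} g(x_{ci},y_{ci})$, and
$\frac{\partial}{\partial \lambda} h(x_{ci},y_{ci}) = -1$.
The derivatives of $f(x_{ci},y_{ci})$ and $g(x_{ci},y_{ci})$ with respect to
$x_{ci}$ and $y_{ci}$ were given in Equations 21 through 24,
respectively. Furthermore, it follows from (38) that

$\frac{\partial}{\partial x_{ci}} v(x_{c1}, \ldots, x_{cg}) = -p_i\,\varphi(x_{N,ci})/\sigma_{X,i}.$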
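How Newton's method uses such derivatives can be sketched on the constraint alone. The Python sketch below is a deliberate simplification and not the author's program: it solves Equation 38 for a single cutting score shared by all subpopulations, whereas the full procedure solves Equations 18, 37, and 38 jointly. It assumes the standardized score (x − mean)/standard deviation, so that each term of v has derivative −p_i φ(x_N)/σ; all names and values are hypothetical.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def solve_common_cutoff(means, sds, proportions, p0, start, tol=1e-10, max_iter=50):
    """Newton iteration for one cutting score x_c with v(x_c, ..., x_c) = 0."""
    x_c = start
    for _ in range(max_iter):
        v = -p0
        dv = 0.0
        for mu, sigma, p in zip(means, sds, proportions):
            x_n = (x_c - mu) / sigma
            v += p * (1.0 - Phi(x_n))    # accepted share contributed by this group
            dv -= p * phi(x_n) / sigma   # derivative of that share with respect to x_c
        step = v / dv
        x_c -= step                      # Newton update
        if abs(step) < tol:
            return x_c
    raise RuntimeError("Newton iteration did not converge")

if __name__ == "__main__":
    # Hypothetical two-group setting; one third of all students may be accepted.
    cut = solve_common_cutoff([50.0, 56.0], [11.0, 12.0], [0.6, 0.4],
                              p0=1.0 / 3.0, start=55.0)
    print(f"common cutting score: {cut:.3f}")
```

Starting the iteration near the mastery level keeps the first step small; far out in the tails the derivative approaches zero and a plain Newton step can overshoot.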

Analogous to the quota-free selection model, as can
easily be seen from Equations 18, 37, and 38, no values for
the utility function parameters $d_{ji}$ (j = 0, 1, 2) have to be
specified when the amount of constant utility for each action
is equal.

A computer program called LAGRANGE has been written to
obtain an optimal decision rule. In the program, the optimal
solution $(x'_{ci}, y'_{ci})$ of the quota-free selection model can be
used as a first approximation in the iterative procedure. A
numerical example illustrating the procedure is given in the
next section.

A Numerical Example

The linear utility model for optimal selection-mastery
decisions was applied to a sample of 43 freshmen in medicine.
Both the selection and mastery tests consisted of 17 free-response
items on elementary medical knowledge with test
scores ranging from 0 to 100. The treatment consisted of a
computer-aided instructional (CAI) program.

Due to prior knowledge, the total population of 43
students could be distinguished with respect to elementary
medical knowledge into a disadvantaged and an advantaged
subpopulation of 27 and 16 students, respectively. Let the
disadvantaged and advantaged population be referred to as
subpopulation 1 and 2, respectively. The normal models
assumed for the distributions of $X_i$ and $Y_i$ showed a satisfactory
fit to the test data for a Kolmogorov-Smirnov goodness-of-fit
test with p-values of 0.869, 0.934, 0.867, and 0.993 for $X_1$,
$Y_1$, $X_2$, and $Y_2$, respectively. The differences between the
theoretical and observed cumulative distribution functions
were 0.0686, 0.1035, 0.1495, and 0.1067, respectively.

The teachers of the course considered students as having
mastered the subject matter if their test scores were at
least 55. Therefore, $t_c$ was fixed at 55.

The means, standard deviations, and correlation between X
and Y were computed for each subpopulation using the maximum
likelihood estimates. Furthermore, the reliabilities of the
test scores were computed as coefficient $\alpha$ (Cronbach, 1951)
for each subpopulation. The results of these computations are
shown in Table 1.
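For readers unfamiliar with coefficient α, the following Python sketch shows the usual computation from a persons-by-items table of item scores: α = k/(k − 1) times one minus the ratio of summed item variances to the variance of the total scores. The function name and the small data set are hypothetical; the item-level data of the study are not reproduced here.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Coefficient alpha for a persons-by-items table of item scores."""
    k = len(item_scores[0])                      # number of items
    items = list(zip(*item_scores))              # one tuple of scores per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

if __name__ == "__main__":
    # Hypothetical scores of five students on four items (not the study data).
    scores = [
        [4, 5, 3, 4],
        [2, 3, 2, 3],
        [5, 5, 4, 5],
        [1, 2, 1, 2],
        [3, 4, 3, 3],
    ]
    print(f"coefficient alpha: {cronbach_alpha(scores):.3f}")
```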



Insert Table 1 about here 



First, the quota-free situation is considered. Since
population 2 was considered more advantaged than 1, it should
hold that $b_{01} > b_{02}$, $b_{21} > b_{22}$, $b_{11} > b_{12}$ for $b_{11}, b_{12} > 0$,
and $b_{11} < b_{12}$ for $b_{11}, b_{12} < 0$. Besides these conditions for
the utility parameters in Formula 3, condition (6) should
hold for the utility parameters $b_{ji}$ (j = 0, 1, 2; i = 1, 2).
Substituting the values of the statistics of Table 1 into
Equations 14 and 16, and using the computer program NEWTON,
Equations 18 and 19 were then solved for $x_{ci}$ and $y_{ci}$ (i =
1, 2) with $t_c$ as starting values. To illustrate the dependence
of the results on the utility structure, optimal cutting
scores were computed for 10 different values of the utility
parameters $b_{ji}$ and $d_{ji}$ (j = 0, 1, 2; i = 1, 2). The absolute
values of $b_{ji}$ and $d_{ji}$ for utility functions 1 through 5 were the
same as the absolute values for utility functions 6 through 10.
However, the sign of $b_{1i}$ was taken negative in the last five
runs, taking into account the fact that the sign of $b_{1i}$ could
not be specified beforehand. The results are reported in
Table 2.



Insert Table 2 about here



As can be seen from Table 2, the consequence of increasing
the parameters $b_{0i}$ and ... (i = 1, 2) is a decrease of the
cutting scores. Furthermore, inspection of Table 2 shows that
a decrease of the amount of constant utility, $d_{ji}$ (j = 0, 1, 2;
i = 1, 2), implies that the cutting scores have to be raised.

Finally, it can be seen that for the simultaneous
solution the optimal selection scores are lower for the
disadvantaged than for the advantaged group. Conversely, the
optimal mastery scores are lower for the advantaged group.
This is an important conclusion, which can be explained by the
fact that the disadvantaged students should be accepted
sooner. On the other hand, however, they should stay longer
in the treatment to be sure that they have mastered the
instructional unit sufficiently so that they may proceed with
the next unit.

Using Equations 27 and 29, the optimal cutting scores
were also computed for the separate mastery and selection
decisions. Since no constant amounts of utility were assumed
for utility functions 2 and 7, Equations 30 and 31 were used
to compute the optimal cutting scores for these two utility
specifications. The results are also reported in Table 2.
Table 2 shows that the optimal selection as well as the
optimal mastery scores in the separate model have been raised
compared to those in the combined model.
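As a worked illustration of Equations 30 and 31, the arithmetic below inserts the mastery level 55 and the Table 1 means and reliabilities, under the assumption that the Table 1 columns read X, Y within each subpopulation. The rounded values are computed here for illustration and are not copied from Table 2.

```latex
\begin{align*}
y'_{c1} &= 62.626 + (55 - 62.626)/0.775 \approx 52.79, \\
x'_{c1} &= 50.875 + (55 - 50.875)/0.762 \approx 56.29, \\
y'_{c2} &= 67.148 + (55 - 67.148)/0.813 \approx 52.21, \\
x'_{c2} &= 56.453 + (55 - 56.453)/0.783 \approx 54.60.
\end{align*}
```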
To give an impression of the gain in overall expected
utility by using the simultaneous approach, the ratio of
overall expected utility for the separate and simultaneous
solution has been calculated. The overall expected utilities
have been calculated by substituting the optimal cutting
scores from Table 2 into Equation 8. The third term in the
right-hand side of (8) has been computed by using numerical
integration methods, while the first and second terms have
been integrated analytically, yielding respectively

$b_{0i}(t_c-\theta_i-\Gamma_i\mu_{X,i})+d_{0i},$

$\{(b_{0i}+b_{1i})(\theta_i+\Gamma_i\mu_{X,i}-t_c)+d_{1i}-d_{0i}\}\{1-\Phi[x_{N,ci}]\}+\sigma_{X,i}\Gamma_i(b_{0i}+b_{1i})\varphi[x_{N,ci}].$

The computer program UTILITY calculates the overall expected
utility; the results are displayed in Table 2.

Finally, the quota-restricted situation is considered.
The proportions $p_i$ (i = 1, 2) of the student population
belonging to each subpopulation were estimated as $n_i/n$,
where n represents the total sample size and $n_i$ represents
the number of students in the sample in subpopulation i. The
proportion $p_0$ of the total student population that could be
accepted for the instructional treatment was arbitrarily set
equal to 0.333. Using the computer program LAGRANGE, the
system of Equations 18, 37, and 38 was then solved for $x_{ci}$
and $y_{ci}$ (i = 1, 2) with the optimal solution of the quota-free
situation as starting values. The optimal cutting scores were
computed again for 10 different values of $b_{ji}$ and $d_{ji}$; the
results are shown in Table 3.



Insert Table 3 about here 



From Table 3 it can be seen that the optimal selection
scores $x'_{c1}$ and $x'_{c2}$ in the quota-restricted model have to be
raised compared to those in the quota-free model. This result
is in accordance with our expectations, because fewer
students can be accepted in the restricted situation. Also,
it follows from Table 3 that the optimal mastery scores $y'_{c1}$
and $y'_{c2}$ are higher than for the quota-free model.

Finally, it can be noticed that, analogous to the quota-free
situation, the optimal selection and mastery scores are
lower and higher for the disadvantaged than for the
advantaged group, respectively.

Discussion 

In this paper an approach to instructional decision making
for combinations of elementary decisions has been presented.
A useful application of simultaneous decision making can be
found in the area of instructional decision making in ISS's.
As an example, two elementary decisions (viz. a selection and
a mastery decision) were combined into a simple ISS to
indicate how, by simultaneous optimization of such networks,
optimal rules for advancing students through ISS's can be
designed within a Bayesian decision-theoretic framework. The
utility structure adopted in this combined decision problem
was a linear utility function.

Further examination of the "best" way to represent more
complicated instructional networks of combinations of
elementary decisions seems a valuable line of research. Such
instructional networks can also be formalized with the aid of
Bayesian statistics, and optimal rules for these simultaneous
optimization problems can be found.

Also, more effort is needed to examine other, more
realistic forms that utility functions might take in certain
educational applications. An example is the normal ogive
utility function (Novick & Lindley, 1978), which takes utility
to be a nonlinear function of the true score. Such a utility
structure might be adopted, for instance, when it is
reasonable to assume a leveling-off effect.

Finally, an interesting line of research seems to be to
design "optimal CAI networks" using the method of simulation.
On the basis of the derived theoretical optimal decision
rules, it can then be determined, for a simulated distribution
of students, in which instructional network the shortest
time is spent to reach a certain final mastery level.

Author's Note



Portions of this paper were presented at the European Meeting
of the Psychometric Society, 1987, Enschede, The Netherlands.
The author is indebted to Wim J. van der Linden for his
valuable comments on earlier drafts of the paper and to Jan
Gulmans for providing the data for the empirical example.
The computer programs NEWTON, LAGRANGE, and UTILITY are
available on request from the author.

Table 1

Statistics for the Selection and Mastery Tests (X and Y)

Statistic             Disadvantaged         Advantaged
                      X         Y           X         Y

Mean                  50.875    62.626      56.453    67.148
Standard Deviation    10.981    11.645      11.674    13.344
Reliability           0.762     0.775       0.783     0.813
Correlation               0.8564                0.8685

Figure Captions



Figure 1. Example of an individualized study system 

Figure 2. A system of one selection and one mastery decision 

Figure 3. Example of a linear utility function for a
selection-mastery decision ($b_{1i} > 0$)